ABSTRACT


This report summarizes and documents the problems and modeling 
techniques associated with the reliability of integrated circuits. 

A general form of a comprehensive reliability modeling rationale 
was then formulated for the integrated circuit. 

It is increasingly important that integrated device failure models 
be amenable to long-term applications. Therefore a rationale was 
formulated which would be consistent with the research findings and 
this long-term constraint. 

The rationale formulated in this report is general. It provides
a basis for future research in the modeling of integrated circuit
reliability.
0.0 Introduction

0.1 Problem Statement

In the past, it has been the practice to evaluate electronic 
component reliability in a rather simple straightforward way. 

This method of reliability assessment merely took the data accrued 
from large-scale life tests or in use data and applied prediction 
models based on classical statistics. With the advent of solid 
state physics and the subsequent integrated device technology, the 
task of assessing device reliability became increasingly more 
difficult. Let us consider transistors and discrete devices separately
from integrated circuits and the more advanced integrated devices,
i.e., large scale integration (LSI). The major point of differentiation
between these two classes is that the parameters of integrated
circuits and LSI devices are only readily accessible from the periphery.
This is the first source of the reliability assessment problem, and it
is mainly attributable to technological advance. It manifests itself
in the lack of data needed for precise model definitions and constraints.

The second problem source is with regard to existing 
modeling techniques. The majority of these can not be considered 
viable. This is because there is no direct input for basic materials 
and process changes which tend to be inherent in technological change. 

The application of existing models creates further concerns.
This is particularly true when longer mission times are being
considered. This application deficiency originates from the assumptions
forming the basis of present model derivations.

The first of these assumptions is that of the constant failure
rate. Suppose that a long mission time is considered and that
the commencement of "wear-out" is not known or estimated.
There is then an incalculable error in the probability of success
when a constant rate is applied to the entire time.
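
The effect can be illustrated with a minimal sketch. It assumes, purely for illustration, that wear-out can be represented by a Weibull term with a shape parameter greater than one; the report does not prescribe this form, and every numerical value below is hypothetical.

```python
import math


def reliability_constant(t_hours, rate_per_hour):
    """Probability of success under a constant failure rate (exponential model)."""
    return math.exp(-rate_per_hour * t_hours)


def reliability_with_wearout(t_hours, rate_per_hour, wearout_scale_hours, wearout_shape):
    """Constant-rate term multiplied by an assumed Weibull wear-out term."""
    wearout = math.exp(-((t_hours / wearout_scale_hours) ** wearout_shape))
    return reliability_constant(t_hours, rate_per_hour) * wearout


# Hypothetical values, chosen only to show how the two curves separate.
RATE = 1.0e-6      # constant failure rate, failures per hour
SCALE = 100_000.0  # assumed Weibull scale for wear-out, hours
SHAPE = 3.0        # assumed Weibull shape (> 1 means an increasing hazard)

for mission in (1_000.0, 10_000.0, 50_000.0, 100_000.0):
    r_const = reliability_constant(mission, RATE)
    r_wear = reliability_with_wearout(mission, RATE, SCALE, SHAPE)
    print(f"{mission:>9.0f} h  constant={r_const:.4f}  with wear-out={r_wear:.4f}")
```

For short missions the two estimates nearly coincide. They diverge as the mission time approaches the assumed wear-out scale, which is the region where the constant-rate estimate cannot be trusted.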

There is a similar problem when one considers deterministic
models. The assumption here is that mechanisms (microscopic
effects) induce failure. The fault in this instance is that
failures due to defects are not considered as life-limiting
influences. Thus there is again a miscalculation of reliability.


0.2 Report Objective

This paper is intended to be an interim report. Its concern 
is the research relating to the development of an integrated circuit 
reliability model. Because of the foregoing problem statement it 
was determined that an integrated and comprehensive modeling 
rationale should consider the following topics: 

1. Defects

2. Modes (defined in Section 1.3.2)

3. Mechanisms, where possible (defined in Section 1.3.2)

4. Frequency of mechanism occurrence (where possible)

5. Manufacturing/processing impact on failure

6. Time dependency of mechanisms

7. Screening influences (supplier and user)

8. Device application


Before the proposed rationale is presented, the objective will be
to attain some understanding of the technological impact on
reliability and of prior modeling techniques. This will include mention
of problem failure modes and mechanisms associated with IC,
MSI, and LSI devices. The emphasis will be placed on generality
so that further work can be done to extend IC assessment techniques
to the more complicated MSI and LSI circuits.
1.0 Discussion

1.1 Current Technological Impact on Reliability Assessment

Before the problem of quantitative modeling and assessment
is considered, a brief understanding of discrete and basic silicon IC
technology should be attained. It is especially important to discuss
the technological impact on device reliability assessment. Without
this information, there is no way of knowing where the emphasis must
be placed within the modeling scheme. This is especially true when
failure mechanisms are considered.

In order to retain the basic objective of modeling via knowledge of
IC manufacture and an understanding of failure, some of the
basic notions regarding integrated circuit failure modes and mechanisms
are presented first.

In the majority of applications undertaken to date, gross
quality defects have tended to dominate IC failures. Therefore, this
portion of this section will attempt to determine if a particular defect
can be isolated as being the most frequent. What follows is an attempt
to identify the major and minor categories of defect related failures
occurring in simple ICs. Information for this was gathered from the
literature published by various parts manufacturers and users. In
some cases the data was more substantial than in others, and where
possible the quantity and source of the data base will be reported.

The data is presented in the following section.
1.1.1 Relative Failure Frequencies


Autonetics (1)

Data Base: 579 confirmed device failures, in-house and field.

Die Bonding                                         27.2%
    Cracked Die
    Faulty Bond from Preform to Die
    Faulty Bond from Preform to Case
    Misorientation of Die
    Insufficient Clearance

Lead Bonding                                        16.2%
    Separation of Bond Interconnect
    Separation of Bond from external lead
    Separation of Bond wire from neck of bond
    Improper Position of Bond
    Formation of Intermetallics
    Insufficient Contact Area
    Voids or Cracks in Bond
    Damaged Area in Silicon Under Bond

Metallization                                       14.7%
    Poor Adhesion
    Improper Thickness
    Erosion/Corrosion
    Mechanically damaged interconnect
    Voids in Metallization
    Improper Metallization
    Separation at passivation step

Passivation Layer                                   13.3%
    Holes
    Cracks
    Non-passivated Area
    Improper Thickness

Diffusion                                           12.9%
    Diffusion process
    Poor alignment or masking
    Damaged mask
    Photolithographic fault
    Tool mark in oxide
    Manufacturing related

Other                                               15.7%
    Internal Lead                                    3.8%
    Foreign Material                                 5.2%
    Package                                          2.2%
    Dielectric                                       0.2%
    Contamination                                    3.8%
    Material                                         0.5%

USAF, RADC (23)

Wire and Bond failures                              33%
Metallization                                       26%
Surface Problems                                     7%
Photolithographic Defects                           18%
Package Defects                                     10%
Miscellaneous                                        6%

Motorola (26)

Contacts                                            25%
Metallization                                        5%
Surface                                             30%
Package                                             10%
Hermeticity                                         10%
Other                                               20%

Texas Instruments (43) on all types of TI monolithic ICs since '61

Bonding                                             28.8%
Metallization                                       22.9%
Surface                                             34.5%
Design                                               5.5%
Bulk                                                 2.7%
Other                                                5.8%

Westinghouse (43)

Bonding plague formation past 200° C                30%
Metallization                                       35%
    Deterioration of interconnects
    Damaged interconnects
Surface shorts through oxide layer                  30%
Foreign material                                     3%
Other                                                2%


RCA (43) in order of prevalence

Bonding
Surface
Oxide growth
Contamination of Al films
Metallization
Purple plague
Short through oxide
Electrical overstress
Epitaxial growth

(9)

Open TC Bonds                                       24.0%
Die to Header Bond Defects                          12.0%
Metallization Defects                               12.0%
Oxide Layer Defects                                 32.0%
Surface Contamination                                1.5%
Other                                               18.5%
    Internal Lead Discrepancy
    Improper, faulty diffusion
    Non-hermetic Seals

JPL Data 

JPL experience suggests that the basic failure dominance 
is ordered in the following way: 

Bonding                                             48.03%
    General condition
    Wedge or die bonds
    Internal lead wires
    Foreign material
    Die mounting

Oxide and Diffusion Faults                          20.41%
    Overlayed passivation
    Lifted passivation
    Cracked passivation
    Voids in passivation

Package Defects                                     11.79%
    General rejects
    Cracked packages
    Holes in package
    Voids in package
    Preform buildup
    Misaligned package

Metallization                                       11.33%
    Scratches
    Voids
    Corrosion
    Adherence
    Bridging
    Alignment

Mask Defects                                         8.42%
    Masking faults
    Maskant misalignment
    Hole in mask
    Maskant undercut
1.1.2 Analysis of Frequency Data

It is clear from the data that the most frequent failure
occurrences for ICs are those associated with bonding procedures,
metallizations and surface effects. However, ranking these from
what is stated in the literature is difficult, if not impossible, because
of reporting inconsistencies. These inconsistencies stem from
several causes, one of which is the semantic problem
involved in the way failures are described when reported.

No firm overall conclusion regarding relative defect
frequency can be drawn. However, it appears that bonding
failures are most prevalent. Surface and metallization effects
are next, since each has a percentage advantage over other types of
failures.
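
As a rough cross-check of this ordering, the sketch below averages the percentages quoted in Section 1.1.1 for the three leading categories. Assigning each source's label to a common category (for example, treating the RADC "Wire and Bond failures" as bonding) is an assumption made here for illustration, and sources that do not report a comparable category are left out of it.

```python
from statistics import mean

# Percentages as quoted in Section 1.1.1. The grouping of differently named
# categories under one heading is an illustrative assumption, not the report's.
reported = {
    "bonding": {
        "RADC": 33.0, "Texas Instruments": 28.8,
        "Westinghouse": 30.0, "JPL": 48.03,
    },
    "metallization": {
        "RADC": 26.0, "Motorola": 5.0, "Texas Instruments": 22.9,
        "Westinghouse": 35.0, "JPL": 11.33,
    },
    "surface": {
        "RADC": 7.0, "Motorola": 30.0,
        "Texas Instruments": 34.5, "Westinghouse": 30.0,
    },
}

# Mean, minimum and maximum reported share for each category.
summary = {
    category: (mean(shares.values()), min(shares.values()), max(shares.values()))
    for category, shares in reported.items()
}

# Categories ordered from highest to lowest mean share.
ranking = sorted(summary, key=lambda category: summary[category][0], reverse=True)

for category in ranking:
    avg, low, high = summary[category]
    print(f"{category:<14} mean={avg:6.2f}%  range={low:5.2f}% to {high:5.2f}%")
```

The unweighted means place bonding first, but the wide range within each category shows why the report declines to give a firm ranking.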

The survey information for the discussion that follows
came from various sources. Usually it is derived from failure
analysis occurring in the following areas.

1. Parts qualification test

2. Screening

3. Subsystem and System Tests

Wright (45) summarized the causes of part failure with
respect to three classifications.

1. Gross quality defects

2. Misuse

3. Time/environment dependent mechanisms

Quality defects may result from any one or a combination of
errors. These may include poor workmanship, operator error or
out-of-control processes. Whether or not these are built-in (inherent
in the technology), they usually result in catastrophic failure. While
they may appear to occur at any time, it is hoped that over greatly
extended use periods the influence of defects will diminish after a
specified period of time. Screening and testing are not 100% effective
in reducing failure due to defects. However, Hi-Rel inspection and
testing do reduce the frequency of failure occurrence.

Misuse is a fairly general term which covers a great 
range of handling. User testing, packaging, and application are 
all part of this category. 

Time and environment dependent mechanisms can generally
be thought of as failure due to progressive deterioration. This
deterioration is a function of time, environment, or both.

It is thought that this type of deterioration is inherent in
device design and material, but can be introduced or influenced by
the improper performance of some manufacturing steps. The gross
effects or modes associated with these mechanisms can be either
catastrophic or associated with parametric degradation.
As with quality defects, however, the frequency of time
dependent mechanisms can be reduced by the
application of rigorous screening and derating.

Reference 45 reviews the analysis of some 300 random failures.
These included equipment malfunctions from the preceding causes.
The number of equipment malfunctions produced by quality defects
was greater than 100% of the number associated with time and
environment; that is, defect related malfunctions outnumbered them.
Also, the ratio of defect to time/environment failures
appeared to be independent of whether parts were screened or not.

It is important to note from this study that defect related failures
dominate over time/environment mechanisms during the initial
operating time. This conclusion is utilized in the rationale
formulation presented in Section 2.0. A further result is that
screening mainly reduces the total number of potential failures.

In this study the escape rate from screening was approximately
the same for quality defects and for time/environment mechanisms.
However, this does not mean that the tests for defects, as opposed
to time/environment mechanisms, are equally effective. There is
still much work needed in the area of screening time/environment
mechanisms, particularly those associated with greatly extended
applications.

This type of escape rate information, as it relates to defects,
is later utilized as part of the prediction rationale presented in
Section 2.2.

The failure information discussed in reference (45)
indicates that gross quality defects appear to be a dominant and
immediate problem. This implies that defect modeling should be
the first priority with regard to reliability prediction modeling.
Judging from the data presented here, care must also be taken when
selecting data to be used with such a model. Data for the initial
phases of model definition (i.e., parameter determination) should be
extracted from IC fabrication procedures and analyses which are
directly related to the major problems of bonding, metallization
and surface effects. Subsequently, less dominant problems can,
if necessary (and/or feasible), be treated.
1.1.3 Mode and Mechanism Correlation

Table 1 is an extension of the previous data. It was 
constructed as an attempt to compile and relate, in more detail, 
the various forms of failure information. This information was 
obtained from the literature, and wherever possible it was related 
in terms of cause, failure indicator, mode, mechanism and related 
remarks. 


JPL Technical Memorandum 33-514 



1.1.4 Mechanisms Inherent Within the Technology

A formal mechanism definition is suggested in Section 1.3.2.
However, this section will concern itself with mechanism
examples inherent within present technology. The intent of this
section is first to present types of mechanisms which could be
considered for modeling. In addition, a compilation of specific
mechanism factors which directly relate to reliability modeling
will be presented (when such facts are available). General
mechanism descriptions and the associated modeling techniques
are not now completely defined. Therefore, the overall objective
of this section will be to document from the literature some of
what is now understood. It is hoped that documentation of this
sort will ultimately be used to check the results of analytic
formulations.

The ultimate goal for mechanism modeling is to attain
enough detailed understanding of failure mechanisms so that
methods for predicting time-to-failure due to such influences can
be developed (41,42). This provides for a consistent quantitative
assessment for all devices which may be subject to the same
mechanisms.

In addition, this understanding of mechanisms may be
applied in other ways. First, and at the most gross level, it can
be used to avoid devices which have mechanisms that prevent the
device from being used successfully in a particular application.
Second, knowledge of specific mechanisms can be used to
accommodate those mechanisms which are inherent in a particular device.
For example, this may take the form of increased metallization
thickness (where feasible) to accommodate a mechanism such
as electromigration.

In addition to the electromigration mechanism just 
mentioned, other types of mechanisms are listed below and are 
subsequently discussed. The discussion will be in terms of the 
following factors: 

(1) Method of mechanism observation 

(2) Type of damage caused by the mechanism

(3) Location of damage 

(4) Activation energies (where possible) 

(5) Impact on reliability 

The mechanisms which may occur within integrated devices
fall into one or more of the following categories (41,42):

(1) Electromigration 

(2) Electric field enhanced diffusion (junctions and dielectrics) 

(3) Electrolytic corrosion 

(4) Radiation damage 

(5) General thermal degradation 

Table 2 presents a summary of some mechanism information 
obtained from the literature. This table is not to be considered 
complete. It is merely intended as an overview of factors which 
modeling techniques must attempt to deal with. 
This portion of this section will discuss the items from
Table 2 (in order) in terms of their relation to device reliability.
Based upon what is currently available in the literature,
electromigration is felt to affect reliability in the following ways:

(1) Large grain ([size illegible]) films exhibit a longer mean time
    to failure (MTF) than do smaller grain (2 μ) sizes (5).

(2) Glass overcoating appears to increase MTF (5).

(3) Temperature gradients and microstructural inhomogeneities
    are important as limiting factors on MTF (5).

(4) There appears to be an initial decrease and then
    saturation of median times to failure and (log normal)
    standard deviation with increased stripe length (2).

(5) Lifetime increases linearly with increase of stripe
    width.

An example of the work done on diffusion mechanisms can be
found in reference 8. This particular study was done for basic
short-term diffusion processes. The basic assumption underlying this
research was that the diffusion coefficients of donors and acceptors
are independent of donor-acceptor concentrations. The result was a
formulation describing junction broadening kinetics, given that the
initial donor-acceptor concentrations are known. The basic
problem with this kinetic description is related to increased time.
When longer times are involved, the corresponding time development
of the donor-acceptor composition profile at the junction must be
known. Implicit in this uncertainty in long time base modeling
is an inability to adequately describe reliability over a lifetime
base. An application of this technique to modeling seems feasible,
provided individual device voltage and temperature variations can
be related to the time dependency of reliability.

By far the most complicated mechanism-reliability
relation is that concerning radiation environments. Some of the
conclusions currently present in the literature are well summarized
by reference 19. Some comments are reproduced here for
convenience. However, further work will not be attempted since it is
beyond the scope of this and subsequent reports.

From the study, it can be concluded that: 1) for the present
state of the art many active components will be seriously degraded by
radiation during interplanetary missions, 2) in many cases data is
inadequate to do more than make gross estimates of degradation of
part type performance, 3) data evaluating proton damage is not
available for many part types, 4) for most part types hardening and
screening procedures are not known or are in a developmental state,
5) although part degradation can be estimated for each environmental
component, there is no data indicating how to assess the total
degradation due to combined environments, and 6) using currently
available data, system reliability in a radiation environment would
be difficult to assess, particularly for part types for which the
radiation levels are near the threshold for damage. Even methods of
assessing such damage need to be more fully explored.

The recommendations reached in reference 19 are as follows:
1) that evaluation testing be performed to obtain data on part types where
no data exists and that lack of data is significant (these cases are noted
in the report), 2) that testing in combined environments be performed
to obtain insight into how to assess the total threat to parts in
interplanetary missions, and 3) that methods of assessing reliability of
irradiated components be more fully explored.
1.1.5 LSI Technology and Reliability Assessment

In general, the facts indicate that as technology has moved
from the discrete (including transistors) to the integrated, there
have been definite improvements and advantages in the areas of cost,
size and weight. However, there has been one major technological
complication to reliability assessment. This complication has been
in the area of decreased accessibility of parameters within the
integrated device. This simply means that only the peripheral
parameters are accessible.

While the above statement holds for both IC and LSI technology,
it appears that the latter technology will bring some departures. It
is useful here to state the nature of these highly integrated device
types (LSI). Although a precise definition for LSI has not been
arrived at within the industry, the following items must be considered
relevant to the topic (3).

1. The LSI concept is an extension of monolithic IC technology.

2. High complexity on a single silicon substrate (at least
   100 gates per chip).

3. Component density/complexity requiring at least
   two levels of interconnects.

Certain changes in processing and design have been
undertaken to extend standard IC technology to LSI devices. The most
important of these extensions are in the following areas.

1. Use of multileveled metallizations

2. Incorporation of larger hermetic packages

3. Increased complexity

Incorporated in these extensions are the reliability-
assessment complications brought about by the smaller geometry
of the LSI. In particular, these complications can be itemized
as follows:

1. Increased susceptibility to electromigration and
   mechanical damage.

2. Higher electric fields between conductors
   (because of closer spacings) resulting in more
   severe migration and corrosion problems.

3. Increased possibility of parasitic PNP action between
   adjacent diffusions due to closer spacing.

4. Higher power density and increased thermal dissipation
   requirements.

5. More precise requirements on mask registration,
   cleanliness and other processing parameters.

6. Increased likelihood of shorting between closely
   spaced conductors due to conductive particles.

7. Small geometry elements are difficult to inspect
   visually.

8. New types and quantities of failure-causing defects
   not experienced by conventional ICs may be seen for
   LSI arrays because of the additional materials,
   processes and complexity.

9. Large scale life tests are prohibited by cost.

10. Complex arrays are frequently fabricated at a
    considerably lower production volume.

11. Many LSI arrays are fabricated with a reliance on
    recent technological developments which may not
    be proven.

12. Most available reliability data is on bipolar arrays,
    with little available data for MOS arrays.

Related to the above limitations are the application advantages
and disadvantages which influence reliability assessment. These
have been summarized in the Microelectronic Device Data Handbook (3).

LSI Application Advantages:

o Fewer part types per application

o Improved performance

o Lower equivalent-device cost

o Smaller equipment size, weight, and volume

o Lower power requirements

o Improved reliability potential

LSI Application Disadvantages:

o Application to only repetitive circuits, limited to those which
  can be handled by a single technology

o Packaging problems

o Complicated test procedures

o Inability to handle large power density

o Mask complexity

o Coordination between supplier and system designers

As Lauffenburger reports, the causes of failure which
can be expected for LSI can come from two general classes. These
are: "Those arising from the extension of Standard IC processing
techniques to LSI components, and those unique to LSI as a result
of the additional processing required to realize the LSI components."
To date, most of the observed LSI failures are those which are
usually associated with Standard ICs. The new mechanisms observed
are the ones associated with the multilevel metallizations and
packaging.
1.2 Problems of Quantitative Assessment and Model Verification

In order to assess and formulate new modeling techniques, it
is necessary to first understand the shortcomings of existing models.
Thus the situation to be modeled must be firmly understood.

Attempts at defining modeling procedures for integrated
devices are at present hampered by several important problems. Some
of these problems seem to be inherent in the nature of the rapid
technological advances being experienced in the electronics industry. There is
first an apparent increase in the general reliability level of integrated
devices (which hampers assessment). This increase, while apparent,
is not well defined quantitatively. When one considers complications
such as lack of part standardization and the lack of a stable technological
base, modeling and verification of reliability necessarily become
difficult. In more concrete terms, information which could be used for
model verification is difficult to obtain. This is largely because of cost
and lack of historical data. The historical factor and cost are interrelated.
The type of data needed for either model verification or model inputs is
sometimes hard to obtain for particular device types. This is primarily
due to the large amounts of data needed. For example, it takes 19,447,500
part hours with one failure to obtain a 0.2 x 10^-6/hour failure rate at a 90%
upper confidence. If attempts are made via life test to generate this type
of historical data, large costs are obviously incurred.
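
The part-hour figure quoted above can be reproduced with the standard upper confidence bound for a constant failure rate in a time-terminated test. The report does not state which method it used, so the sketch below is an assumption that happens to agree with the quoted number; the function names are illustrative.

```python
import math


def poisson_cdf(failures, mean_count):
    """P(N <= failures) for a Poisson count N with the given mean."""
    return sum(
        math.exp(-mean_count) * mean_count ** i / math.factorial(i)
        for i in range(failures + 1)
    )


def expected_failures_bound(failures, confidence):
    """Mean count m at which seeing `failures` or fewer has probability 1 - confidence.

    This equals half the chi-square percentile with 2 * (failures + 1)
    degrees of freedom, found here by bisection so only the standard
    library is needed.
    """
    low, high = 0.0, 1000.0
    for _ in range(200):
        mid = (low + high) / 2.0
        if poisson_cdf(failures, mid) > 1.0 - confidence:
            low = mid   # the mean is still too small
        else:
            high = mid
    return (low + high) / 2.0


def required_part_hours(target_rate_per_hour, failures, confidence):
    """Part hours needed to demonstrate the target rate at the stated confidence."""
    return expected_failures_bound(failures, confidence) / target_rate_per_hour


hours = required_part_hours(0.2e-6, failures=1, confidence=0.90)
print(f"required part hours: {hours:,.0f}")
```

The result agrees with the 19,447,500 part hours quoted in the text to within rounding of the tabulated chi-square value.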

Attempted model procedures must ultimately consider the
constituents of the situation cited above. First, the changing level of
technology implies that modeling procedures must be both comprehensive
and adaptable to new situations. Second, there is also a problem of
determining the parameters used in formulating a specific model.
In relation to the above statement, one should attempt to
define a criterion for the selection of integrated devices being modeled.
This basically means that the device will be representative of
current design, materials, processing and fabrication practices.
Furthermore, the device must be of mature design and its processes must
be understood. If model verification is to be attempted it will be
contingent upon selected device types having a substantial history of
reliability testing and failure analysis. This means that failure modes
are at least fairly well documented and test data are available as a base
for failure mechanism analysis.

Further information regarding device selection criteria for
modeling was summarized by Vaccaro (39) as follows:

1. Results obtained from study of specific devices should,
   whenever possible, be examined for applicability to
   similar or related devices to determine what
   generalizations may be made.

2. Failure mechanisms of principal interest should be those
   which are not detected by screening and/or which are the
   principal determinants of long term reliability.

3. As a general rule, efforts should not be directed to basic
   studies of material properties or atomic and molecular
   processes in materials, but rather to the application of
   such existing knowledge for analysis of failure mechanisms
   in devices.

There are basically two methods for prediction model verification.
There is, of course, the "classical approach" or failure rate school.
This type of verification entails massive life tests, usually at rated
conditions. The second and more recent approach, "Physics of Failure",
considers the mechanisms by which devices fail. This is usually done
via smaller test lots at accelerated stress levels. The object of this
second approach is best summarized with the following iterative
procedure:

1. Failure generated at accelerated level 

2. Failure analysis conducted 

3. Failure modes and mechanisms are determined 

4. Data is extrapolated 

5. Corrective action initiated 

6. The object being to reduce or eliminate failure 
mechanisms 
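
Step 4 of this procedure, the extrapolation of accelerated data to use conditions, can be sketched as follows. The sketch assumes an Arrhenius temperature dependence governed by a single activation energy; the report does not specify the extrapolation law here, and the activation energy, temperatures and stress life below are hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV per kelvin


def arrhenius_acceleration(activation_energy_ev, use_temp_c, stress_temp_c):
    """Ratio of time-to-failure at use temperature to that at stress temperature."""
    use_temp_k = use_temp_c + 273.15
    stress_temp_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / use_temp_k - 1.0 / stress_temp_k
    )
    return math.exp(exponent)


def extrapolate_life(stress_life_hours, acceleration_factor):
    """Scale a life observed under stress to the estimated life at use conditions."""
    return stress_life_hours * acceleration_factor


# Hypothetical inputs for illustration only.
factor = arrhenius_acceleration(activation_energy_ev=0.7, use_temp_c=55.0, stress_temp_c=150.0)
use_life = extrapolate_life(stress_life_hours=1_000.0, acceleration_factor=factor)
print(f"acceleration factor: {factor:.0f}, estimated use life: {use_life:,.0f} h")
```

The extrapolation is only as good as the assumption that the same mechanism, with the same activation energy, dominates at both temperatures.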

There are, of course, some unknowns associated with the
accelerated method of testing. The most important of these can be
summarized as follows (39):

1. There is a need for better understanding of device failure
   mechanisms when designing accelerated tests.

2. It is essential to determine whether mechanisms other
   than those which prevail at normal use stress levels are
   introduced under higher stress conditions.

3. Process must be under control so that each iteration comes
   from the same base line.

4. Only one specific cause of failure can be considered at a time.
   A device with several different processes may exhibit a
   different degradation process.
1.3 Survey of Modeling Techniques

1.3.1 Factor Models

In an effort to assess the reliability of solid state
applications, two distinct modeling approaches have evolved.
The first of these approaches is empirical in nature, and
ignores the underlying causes and changes which lead to
device failure. For convenience this type of modeling will
be referred to as factor modeling. Example 1 presents
formulations for some common factor models.





EXAMPLE 1

Reference 1

Application: Large scale array MOS with little achieved data

Rationale:   Transformation constructed from what is known about bipolar

Reference 27

Application: Bipolar integrated circuits

Rationale:   1) Accounts for the current status of microcircuit failure
                knowledge
             2) Simultaneous solution of equations to obtain lambda (λ)
                estimates
             3) Considerable use of data acquired at high stress levels

             Model validity - 30,000 hours

Model:       [equation illegible in source]

Failure Rates (λ): chip, interconnect, package

Pi-Factors (π):    grade, temperature, environment

Other quantities:  external leads, number of bipolar gates, number of leads

Reference 32

Application: Double or triple diffused silicon planar devices

Rationale:   1) Factor operation upon base failure rate
             2) Base failure rate assumed to have an Eyring relationship
                to temperature
             3) Model validity: 20,000 hours

Model:       [equation illegible in source; pi-factors operate on the
             base failure rate]

Failure Rates (λ): failure rate for microcircuits in % per 1000 hours;
                   base rate (function of temperature)

Pi-Factors (π):    complexity, package, environment

Achieved Reliability 

Factor models such as those presented in Example 1 are basically
inadequate for modeling present state-of-the-art applications for two
reasons. First, they cannot adequately model current devices, because
they do not allow for actual in-process fabrication influences
relating to new devices used in current designs. These modeling schemes
may have been adequate for describing the reliability of discrete parts
used for relatively short applications. Second, we are usually
restricted to using these factor models to calculate the
probability of success for all manner of failure causes, and the technique
does nothing to treat the dynamics of device failure due to
mechanisms occurring during long applications.
1.3.2 Modeling via Physics of Failure

For this discussion two definitions must first be agreed upon. These
are the definitions associated with the macroscopic and microscopic
types of integrated device failure. They are fairly standard and have
been adequately defined by Vaccaro (Reference 39).

Failure:            Gross malfunction and/or out-of-spec parameter.

Failure Mode:       The outward manifestation of failure relating to
                    the terminal behavior of an electronic device.

Failure Mechanism:  A theoretical model devised to explain at the
                    atomic and molecular level the observed failure
                    mode.

Much work in the area of reliability physics has been done 
by Joseph Vaccaro of the Rome Air Development Center (RADC). 

A large amount of what is presented here is his philosophy as it 
appeared in reference 39. 

The nature of failure mechanisms may be physical or chemical or, in
some instances, both. As a general rule, they occur in combination.
Ideally, we would like to isolate and identify mechanisms
individually, in order to study the kinetics associated with device
failure as a function of device composition and configuration under
various stress conditions. This, of course, is difficult for all but
the simplest cases. The reliability physicist and chemist may
therefore have to settle for those mechanisms which appear to be
dominant.

Mechanisms with actual transfer or rearrangement of mass fit the
class of failures called "intrinsic" or "wearout". Examples might
include intermetallic compound formation in ohmic contacts and
interface diffusion. These seem to be the most tractable to failure
mechanism analysis. As solid state devices become more reliable and
manufacturing processes become more automated (i.e., reduction of
human error), device reliability will more closely approach the
"intrinsic" limitations imposed upon it by materials, configuration,
processing, design and application.

An average failure rate or model-derived failure rate may serve as a
useful generic estimate of device reliability during the early design
phases of an electronic system. However, such rates have limitations
when applied to the reliability of a particular device. The physics of
failure has a unique advantage in that it yields important feedback
information which can be applied to the basic device design. Thus, an
iterative method is provided which not only estimates reliability but
also aids in improving it.

The objective of the failure mechanism approach with regard to
reliability prediction is to relate changes in device parameters to
the basic atomic and molecular changes which cause device failure.
With this approach, reliability improvement or assessment is sought
via an understanding of activation energies and kinetic rate
expressions. Theoretical knowledge regarding the behavior of materials
under environmental and stress conditions is required for this
approach.

Vaccaro (Reference 38) adequately describes the complex relation
between device parameters, mechanisms and stress in the following
manner.

     "Degradation may be essentially continuous, either linearly
     or nonlinearly; it may be discontinuous, either randomly or
     periodically; its effects may be cumulative or noncumulative;
     or it may even be some combination of these possibilities
     whose proportions change with time and/or stress level. The
     causative mechanisms may be inherent to the device, or they
     may be the result of environment, or both."

Assessment of change processes in materials at the particle level
(molecules and atoms) is made possible by using quantum and
statistical mechanics theory. Arrhenius and Eyring transition state
theory describes the distribution of particles for a given energy
state via partition functions. Reaction rate expressions can be
obtained when these distributions are combined with other basic
parameters (viz. energy and entropy of activation). Transition state
theory assumes equilibrium conditions and is therefore subject to
limitations. The theory is well suited to simple chemical reactions in
gases and liquids and can therefore be extended to similar problems
such as solid state diffusion.
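
As an illustration of the rate expressions mentioned above, the
following Python sketch evaluates the Arrhenius relation and the Eyring
transition state expression (transmission coefficient taken as 1). The
function names, the pre-exponential factor and all numerical inputs are
assumptions chosen for illustration; none of them come from the
referenced work.

```python
import math

K_B_EV = 8.617333e-5        # Boltzmann constant, eV/K
K_B_J = 1.380649e-23        # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34   # Planck constant, J*s
R_GAS = 8.314462618         # molar gas constant, J/(mol*K)


def arrhenius_rate(prefactor, ea_ev, temp_k):
    """Arrhenius relation: rate = A * exp(-Ea / (k*T))."""
    return prefactor * math.exp(-ea_ev / (K_B_EV * temp_k))


def eyring_rate(dh_j_per_mol, ds_j_per_mol_k, temp_k):
    """Eyring expression: (k*T/h) * exp(dS/R) * exp(-dH/(R*T)).

    dh_j_per_mol   : enthalpy of activation (assumed input)
    ds_j_per_mol_k : entropy of activation (assumed input)
    """
    frequency = K_B_J * temp_k / H_PLANCK
    return (frequency
            * math.exp(ds_j_per_mol_k / R_GAS)
            * math.exp(-dh_j_per_mol / (R_GAS * temp_k)))


# Hypothetical mechanism: Ea = 1.0 eV, unit pre-exponential factor.
ratio_400_to_300 = (arrhenius_rate(1.0, 1.0, 400.0)
                    / arrhenius_rate(1.0, 1.0, 300.0))
```

The ratio computed at the end shows how strongly a thermally activated
rate depends on temperature, which is the property exploited by the
stress testing discussed in this report.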

The practical application of any statistical theory of reaction rates
is limited to elementary reactions. More complex reactions must be
broken down into their constituent elementary reactions so that the
theory may be applied. This relegates the theory to a conceptual
method which supplements other available evidence used for determining
rates of reaction or postulating mechanisms.

In the semiconductor device, several mechanisms may be acting
simultaneously; therefore, the initial conditions of the reaction are
often unknown. Since mechanism rate behavior depends upon fine
structure, geometry, composition, initial reaction conditions and
reaction rates, prediction of total device behavior from the
statistical theory of reaction rates is not possible. Therefore,
kinetic studies of solid state reactions are carried out on simplified
monitoring structures. This results in better controlled test
conditions and a limited number of mechanism-dependent variables.

The benefits to be gained from a mechanism approach derive mainly
from understanding, which will ultimately mean reliability
improvement. In addition to the overall reliability improvement, other
benefits may include:

1. Effective process corrective action

2. Sounder understanding of accelerated testing

3. Improved screening techniques

4. Better process and material controls

Suppose a correlation is established between a device parameter
change and some suspected causative failure mechanism. This
information is then clearly useful in attempting to eliminate the
mechanism or learn of its consequences. It is possible that this
cannot be accomplished quantitatively. Even if such a relation is
known to exist, the interpolation of any data obtained (viz. log of
mechanism rate of change vs. absolute temperature) must be handled
with some caution when attempting to link parameter degradation with a
suspected mechanism. Vaccaro (Reference 38) reports that the ultimate
arbiter in this situation must always be careful physical analysis of
failed devices to determine the actual mechanisms contributing to
failure. A factor which contributes to the uncertainty is that the
activation energies of most processes of interest lie in a narrow
range (less than 4 eV). Hence, the scatter in experimental data in an
Arrhenius plot, for example, may preclude useful accuracies from being
obtained. Several mechanisms may also proceed simultaneously; this
makes the interpretation of empirical data difficult, since an
observation may reflect the resultant of several activation energies.
Further caution must be exercised since the relation between device
parameters and mechanisms is not necessarily linear.
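
The difficulty of reading a single activation energy from data produced
by several simultaneous mechanisms can be illustrated with a small
Python sketch. It fits a straight line to ln(rate) versus 1/T, as one
would on an Arrhenius plot. The two mechanisms, their pre-exponential
factors and their activation energies are hypothetical values chosen
only so that both contribute over the temperature range; the function
names are likewise assumptions.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant, eV/K


def combined_rate(temp_k, mechanisms):
    """Sum of Arrhenius rates; mechanisms is a list of (A, Ea_eV)."""
    return sum(a * math.exp(-ea / (K_B_EV * temp_k))
               for a, ea in mechanisms)


def fit_activation_energy(temps_k, rates):
    """Least-squares slope of ln(rate) vs 1/T, returned as Ea in eV."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return -slope * K_B_EV


# Hypothetical mechanisms (assumed values, not measured data).
mechanisms = [(1.0e6, 0.6), (1.0e13, 1.1)]
temps = [350.0 + 10.0 * i for i in range(11)]   # 350 K to 450 K
observed = [combined_rate(t, mechanisms) for t in temps]

# The fitted value is a resultant of both activation energies.
apparent_ea = fit_activation_energy(temps, observed)
```

In this sketch the fitted value falls between the two underlying
activation energies and matches neither, which is the sense in which an
observation "may reflect the resultant of several activation energies."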

This mechanism theory is further complicated by the probability of
integrated device defects. Although there are mechanisms which can
eventually cause failure in "perfect" devices under certain
applications, there is definitely an interaction between defects and
mechanisms. Thus, the accuracy of time-to-failure prediction will
always be limited by the degree of uncertainty in predicting a
defect's probable existence, nature and distribution.

In application, mechanism studies readily lend themselves 
to the establishment of upper limits on the times to failure for 
ideal conditions. This information can then be applied via feedback 
to improve basic device design. The estimates for reliable per- 
formance can then be assessed with models relating to the most 
probable defect situations. 

The mechanism approach provides a much needed addition to any
comprehensive modeling and assessment scheme. As will be noted later
(Section 2.3), the simplified mechanism feedback method can enhance
long term device reliability. This method is used because of its
relative simplicity as opposed to the more difficult and, at this
point, less understood mechanism/failure distribution approach.

The time-to-failure distribution of a device population is of prime
importance to "classical" reliability prediction techniques. To date,
however, the reliability physicist (mechanism approach) has
contributed little to this aspect of the reliability problem. Many
time-to-failure distributions are known and are available for use;
however, there has not been a significant contribution from the
kinetics of the underlying physical degradation process.

Stewart and others have done much to relate underlying physical
mechanisms to time-to-failure distributions. Two of Stewart's theorems
(Reference 34) are stated for reference in this text. These should
eventually have far-reaching impact on the study of kinetics and
time-to-failure distributions.


Theorem 1   The dependence of the failure rate due to a particular
            property (failure mechanism) upon a generalized stress
            (e.g. temperature) is not necessarily the same as the
            dependence of the property itself upon that generalized
            stress.

Theorem 2   If two failure mechanisms are acting and one is
            predominant over a given generalized stress range, the
            dependence of the failure rate upon the generalized
            stress can be affected by the kinetics of the subordinate
            as well as the dominant mechanism and the stress
            dependence of both.
2.0 Prediction Rationale

2.1 Intent

This model discussion will proceed on the hypothesis 
that there is a time base distinction between defect and mechanism 
failures. More precisely, there is some initial time in which 
defect failures dominate over failures due to mechanisms. This 
distinction is important since it is becoming increasingly necessary 
to attempt modeling long term applications. 

Because of the time distinction between defect and 
mechanism failures, the prediction rationale formulated in this 
paper will be divided into two independent sections (i.e., a portion
for defects and a portion for mechanisms). Implicit in this is the
assumption that any possible interaction between defects and 
mechanisms is negligible compared to their respective independent 
effects. The defect failure modeling presented here will account 
for the supplier and user influences on device failure. The second 
portion of the proposed modeling scheme will be discussed in terms 
of the mechanisms which are likely to produce failure during long 
term applications. The outcome of this second rationale will be 
directed at accommodating the effects of known failure mechanisms 
during device design. 

For the reasons noted in Section 1, failure data gathered for
state-of-the-art parts is not representative of the conditions
involved in long term applications. This implies that modeling
techniques utilizing only this type of data are incomplete, for two
basic reasons. First, the time over which failure data is accrued for
these models is not commensurate with the times involved in longer
applications. Second, it is important that the modeling schemes used
for long term applications account for the distinct portions of time
during which defects and mechanisms respectively cause failure.

Allowing for the uncertainty of knowing the exact number of defects
present in a particular device, statistical techniques can readily be
applied. It therefore seems logical that a probabilistic factor model
could be used to describe the initial defect-dominated failures.
2.2 Defect Failure Modeling

In Section 2, it was determined that integrated device defects are
predominantly related to bonds, metallizations and surface structures.
A realistic failure model should, therefore, attempt to describe the
source of these failure defects. First, these defect type failures
seem to be most directly related to the basic design, materials,
processing, and screening techniques applied to the fabrication of
integrated devices. These influences are those which are directly
contributed by the manufacturer or supplier. The supplier influences
on device reliability are becoming increasingly important with the
application of LSI and MSI devices. Therefore, tight controls and
inspection are necessary both for reliability assurance and for
reliability modeling information.

The second major influence on the integrated device is that 
of the user. This influence impacts device quality in the form of 
user handling, screening, and application. It therefore seems 
logical that a defect model attempting to describe device reliability 
should be directly related to both the above influences. 

The literature search conducted revealed the basis for a model
formulation which has a great deal of latitude in describing basic
supplier/user influences on device reliability (Reference 44). This
modeling technique is, at present, not necessarily more correct than
any of the others considered (Example 1, page 24). It does, however,
present one important advantage: the model is formulated in such a way
as to be more dynamic than most models of this type. This model is
first subject to data which has been derived from testing and use
conditions. Most important, it is subject to the manufacturing
influences inherent in the production of the particular device of
interest.

The models presented in Example 1 (which are fairly representative)
rely only upon data obtained after device fabrication, specifically
data from testing and from failures resulting from use.

In the absence of information which can be used to formulate a more
complicated model, the failure rate relating supplier influences can
be thought of as being made up of three basic additive components.
These components are related to fabrication materials, device design
and quality. The three factors are denoted respectively as
$\lambda_M$, $\lambda_D$ and $\lambda_Q$. The quality portion is to be
thought of as the factor which reflects device failure due to quality
defects. In order to measure the presence of these defects in the
in-use device, the quality factor must be modified. This can be
handled by a probabilistic statement regarding the likelihood of such
quality defects escaping the supplier screens. The quality factor and
its related probability are particularly important, since the
overriding opinion suggested by the literature is that early device
failure is defect dominated.
2.2.1 Supplier Influences

The basic supplier model described above and obtained from Ref. 44
is formulated as follows:

(1)   $\lambda_B = \lambda_M + \lambda_D + P_E \lambda_Q$

Where

$\lambda_B$ = The base failure rate in appropriate units at some
              specified reference temperature.

$\lambda_M$ = Material and process failure rate attributable to basic
              material and process limitations.

$\lambda_D$ = Design failure rate attributable to design factors such
              as complexity.

$\lambda_Q$ = Quality failure rate attributable to quality defects.

$P_E$       = The probability of a defective device escaping through
              supplier reliability inspection and screening.

2.2.2 User Influences 

There are three basic areas in which the user has significant impact
upon device reliability. These are, in order of importance:

1. User inspection and screening used to detect quality defects.

2. Degradation due to application.

3. Miscellaneous: handling during testing, assembling, and general
   deployment.

Again, the influence of inspection and screening can best be denoted
probabilistically. However, there must be an accounting of the effect
or impact of the preceding supplier screens and tests.

In order for a basic quality defect to be present at this point, it
must escape both the supplier screens and the user screens and
inspections.

Let   $P_E$ = The probability of a defective device escaping supplier
              reliability inspection and screening.

      $P_U$ = The probability of a defective device escaping user
              reliability inspection and screening.

Then

      $P(E \cap U)$ = The probability of both the above events
                      occurring, assuming E and U are independent.

By definition

      $P(E \cap U) = P_E \cdot P_U = (1 - P_E')(1 - P_U')$

where the primed values are complements.

The in-use failure rate can again be thought of as the additive
combination of $\lambda_M$, $\lambda_D$ and $\lambda_Q$. The quality
factor $\lambda_Q$ must again be modified by the probability of its
presence in the in-use device. The formulation for this is as follows:

(2)   $\lambda_U = [\lambda_M + \lambda_D + P_E \lambda_Q - (1 - P_U) \lambda_Q + (1 - P_E)(1 - P_U) \lambda_Q] K_1 + K_2$

Where $\lambda_U$ = in-use failure rate

      $\lambda_M$, $\lambda_D$ and $\lambda_Q$ are as in equation (1)

      $K_1$ = failure factor derating due to application

      $K_2$ = miscellaneous factor for handling, assembling and
              general deployment.

Equation (2) may be simplified into the following form:

(3)   $\lambda_U = [\lambda_M + \lambda_D + P_E P_U \lambda_Q] K_1 + K_2$
Investigation is, of course, needed to determine the lambda factors
and the constants $K_1$ and $K_2$. However, this does not mean that
the values of all constants must be determined explicitly. Various
terms may in fact be negligible, so that simplifying assumptions could
be made which in turn reduce the computation and the need for related
data. An intuitive feeling provided by the literature surveyed
suggests that $\lambda_M$ and $\lambda_D$ from equation (1) may be
negligible compared to the quality factor $\lambda_Q$. Similarly, the
$K_2$ factor from equation (3) may also be quite small.
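
The defect model of this section can be sketched in a few lines of
Python. The sketch assumes the reading of the in-use model in which the
application factor multiplies the bracketed sum and the miscellaneous
factor is additive; that reading, the function names and the numerical
values are assumptions made for illustration only. The expanded and
simplified forms are both coded so that their equivalence can be checked
numerically.

```python
def supplier_failure_rate(lam_m, lam_d, lam_q, p_e):
    """Supplier model: material + design + escaped quality defects."""
    return lam_m + lam_d + p_e * lam_q


def in_use_failure_rate_expanded(lam_m, lam_d, lam_q, p_e, p_u,
                                 k1=1.0, k2=0.0):
    """Expanded in-use form, written with complements of P_E and P_U."""
    quality = (p_e * lam_q
               - (1.0 - p_u) * lam_q
               + (1.0 - p_e) * (1.0 - p_u) * lam_q)
    return (lam_m + lam_d + quality) * k1 + k2


def in_use_failure_rate(lam_m, lam_d, lam_q, p_e, p_u, k1=1.0, k2=0.0):
    """Simplified in-use form: quality term weighted by P_E * P_U."""
    return (lam_m + lam_d + p_e * p_u * lam_q) * k1 + k2


# Hypothetical inputs (failure rates in arbitrary consistent units).
example = in_use_failure_rate(lam_m=0.01, lam_d=0.02, lam_q=1.0,
                              p_e=0.1, p_u=0.2, k1=1.5, k2=0.003)
```

If the material and design terms and the miscellaneous factor are
neglected, as the text suggests may be reasonable, the result reduces to
the quality failure rate scaled by the two escape probabilities and the
application factor.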
2.3 Rationale and Approach to Mechanism Modeling

As was noted previously, there are two distinct causes of device
failure over long use periods, namely defects and mechanisms. We can
retain the assumption that failure-causing defects are independent of
time (after the so-called "burn-in"). However, failure mechanisms
propagate with time and are thus time and/or environment dependent.
Therefore, any proposed prediction rationale must address the time
base incompatibility between failures caused by defects and those
caused by mechanisms.

This problem can be treated in several ways. First, one can disregard
the respective contributions to failure caused by defects and
mechanisms. This is done by compressing time through accelerated
device life, with the failure effects demonstrated empirically. This
is the basic accelerated life test (ALT) philosophy.

The purpose of this method is to establish an acceleration factor.
Thus, devices to be used in long applications can be tested to life on
a compressed time scale. This provides an opportunity to test more
devices, thus accruing more data for establishing the expected device
life.

There are, of course, problems associated with this method of
reliability assessment. Most prominent among these is that of
obtaining an accurate algorithm for the life acceleration factor.

The other problems associated with ALT concern the relation between
failure mechanisms and mechanism rates of change with respect to time.
In particular, assessment of a specific failure-causing mechanism may
produce an erroneous rate to failure. When this happens, establishment
of an acceleration factor is not possible.
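
To make the notion of an acceleration factor concrete, the following
Python sketch uses the common Arrhenius form for a single thermally
activated mechanism. This form is an assumption introduced here for
illustration; the report does not prescribe a particular algorithm, and
the activation energies and temperatures below are hypothetical.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant, eV/K


def arrhenius_acceleration_factor(ea_ev, use_temp_k, stress_temp_k):
    """Ratio of the rate at stress temperature to the rate at use
    temperature for one mechanism with activation energy ea_ev."""
    return math.exp((ea_ev / K_B_EV)
                    * (1.0 / use_temp_k - 1.0 / stress_temp_k))


def equivalent_use_hours(test_hours, ea_ev, use_temp_k, stress_temp_k):
    """Use-condition hours represented by test_hours at stress."""
    return test_hours * arrhenius_acceleration_factor(
        ea_ev, use_temp_k, stress_temp_k)


# Hypothetical case: 1000 test hours at 423 K, use condition 328 K.
# Two candidate activation energies show how sensitive the result is
# to the mechanism that is assumed to be operating.
hours_if_ea_07 = equivalent_use_hours(1000.0, 0.7, 328.0, 423.0)
hours_if_ea_10 = equivalent_use_hours(1000.0, 1.0, 328.0, 423.0)
```

The two results differ by a large factor, which illustrates why an
incorrectly identified mechanism, and hence an incorrect activation
energy, undermines the acceleration factor.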
A second method is for the defect and mechanism influences on failure
to be handled separately. In this instance, each cause has its own
respective time-to-failure distribution. For example, defect-related
failures could be described by the common exponential assumption. The
new relation would be between failure kinetics and appropriate
time-to-failure distributions.

The results of these two cases have a common probabilistic base and 
can thus be combined to obtain the overall probability of device life. The 
result would thus cover the full range of causes which govern device failure. 
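
A minimal sketch of such a combination is given below in Python. It
assumes that defect and mechanism failures are independent, that defect
failures follow the exponential assumption, and that mechanism failures
follow a Weibull distribution with increasing hazard. The Weibull choice
and all parameter values are assumptions for illustration; the text does
not specify a mechanism distribution.

```python
import math


def defect_reliability(t_hours, failure_rate_per_hour):
    """Exponential assumption for defect-related failures."""
    return math.exp(-failure_rate_per_hour * t_hours)


def mechanism_reliability(t_hours, scale_hours, shape):
    """Weibull survival function (assumed form for wearout)."""
    return math.exp(-((t_hours / scale_hours) ** shape))


def device_reliability(t_hours, failure_rate_per_hour,
                       scale_hours, shape):
    """Probability of surviving both causes, assuming independence."""
    return (defect_reliability(t_hours, failure_rate_per_hour)
            * mechanism_reliability(t_hours, scale_hours, shape))


# Hypothetical values: defect rate 1e-6 per hour, wearout scale 1e5
# hours with shape 3, evaluated at a 50,000 hour application.
r_long_mission = device_reliability(5.0e4, 1.0e-6, 1.0e5, 3.0)
```

With these assumed numbers the defect term dominates early in life and
the mechanism term dominates late, which mirrors the time base
distinction drawn in Section 2.1.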

Work relating to failure kinetics and time-to-failure distributions
can be seen in References 34, 35 and 36. This work was prepared by
R. Stewart of the Lockheed Palo Alto Research Laboratory. His basic
premise is that "failures are caused." The basic constraint equation
which relates failure kinetics and time-to-failure is expressed in
equation (4).

(4)   $\Gamma_{L_i} = \int_0^{T_i} \sum_j \frac{\partial \Gamma_i}{\partial \theta_j} \frac{d\theta_j}{dt} \, dt$

Where

$\Gamma_{L_i}$ = failure limit or quantitative parameter tolerance
                 (associated with a particular device characteristic)

$d\theta_j / dt$ = mechanism(s) rate of change with respect to time

$\partial \Gamma_i / \partial \theta_j$ = device characteristic(s) rate
                 of change with respect to mechanism(s)

$T_i$ = time-to-failure for the i-th characteristic
Along with this basic premise, there is presented what is called a
"causal redefinition of failure rate." This is intended to differ from
the standard hazard rate formulation, which is defined in the calculus
as the limit condition:

      $Z(t) = \lim_{h \to 0} \frac{R(t) - R(t+h)}{h \, R(t)}$

Where R(t) = reliability function

The Stewart redefinition is formulated in equation (5):

(5)   $F_i = 1 / T_i$

      $F_i$ = failure rate for the i-th mechanism

      $T_i$ = time-to-failure for the i-th mechanism

Given variations in the rates of equation (4) (because of the
stress/activation energy relation, and also variations inherent in the
device), a corresponding variation in the values of $T_i$ can be
observed. The object then is to relate the appropriate distribution(s)
to these time values. Thus, probabilities of success or failure may be
obtained.
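
The effect of such variation can be sketched numerically in Python. The
sketch draws hypothetical times to failure from a lognormal distribution
(an assumed form, chosen only for illustration), applies the Stewart
definition of failure rate as the reciprocal of time to failure from
equation (5), and averages both quantities over the population. The
function names and parameter values are assumptions.

```python
import random


def simulate_times_to_failure(n_devices, mu, sigma, seed=0):
    """Draw n_devices hypothetical times to failure (hours)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n_devices)]


def stewart_failure_rates(times_to_failure):
    """Failure rate of each device as the reciprocal of its time."""
    return [1.0 / t for t in times_to_failure]


times = simulate_times_to_failure(1000, mu=10.0, sigma=0.5)
rates = stewart_failure_rates(times)

mean_time = sum(times) / len(times)   # average time to failure
mean_rate = sum(rates) / len(rates)   # average failure rate
```

For any spread in the times to failure, the average of the reciprocals
is at least as large as the reciprocal of the average, so the average
failure rate cannot simply be taken as one over the average time to
failure.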

The basic formulations for the average values of $F_i$ and $T_i$ are
given in equations (6) and (7):

(6)   $\bar{F} = \frac{1}{N} \int \frac{1}{t} \, P(t) \, dt$

(7)   $\bar{T} = \frac{1}{N} \int t \, P(t) \, dt$

Where N         = number of devices

      P(t)      = a continuous, normalized density distribution
                  function

      $\bar{F}$ = average failure rate

      $\bar{T}$ = average time to failure

What appear to be the most promising relations developed by Stewart
have been presented in Section 1.4. However, there have been some
criticisms leveled at Stewart. Most notable of these are the
criticisms of Paul Gottfried.
The major criticism relevant to the modeling procedure developed
herein concerns the problem of early failures relating to
manufacturing errors. These errors cannot be handled by Stewart's
deterministic relations. Therefore, this modeling scheme is not
complete when one is attempting to model the general influences on
integrated device failure.

However, the benefits of Stewart's approach provide excellent reasons
for more research in this area, mainly because the deterministic
approach is precisely what is needed to describe the mechanism
influence on failure during very long applications. The benefits
mentioned are twofold. First, the effort required to formulate the
rate equations presented in equation (4) promotes physical
understanding of the materials used for device manufacture. Second,
Stewart's model may ultimately provide a practical technique for
reliability prediction from the fundamentals. This is especially
important when trying to evaluate new technology applications to
integrated devices.

A third rationale for reliability prediction is quite similar to the
one just presented. Again, the failure effects produced by defects and
mechanisms are treated separately. Defect-caused failures are assessed
with the formulation presented in Section 2.2. However, the kinetic
effects of failure mechanisms are limited to their relation to certain
device electrical parameters. This relation utilizes the same basic
formulation of equation (4). However, instead of relating kinetics and
failure distributions, the primary objective of these kinetics studies
is directed at basic device design. Work relating to this subject is
presented in References 17 and 33.

A more detailed representation of this third rationale is presented
in the next section (2.4).
2.4 Primary Objective for Mechanism Modeling

Given the current level of knowledge with respect to mechanisms
(Section 2.3), this third rationale seems to be the most appropriate.
There are two basic reasons behind this statement. First, much more
work is needed before a thorough understanding of the relation between
mechanisms and device failure is obtained. This is a primary requisite
before time-to-failure distributions can be applied. Second, even if
one or more mechanisms were understood, much more information would be
required before distributions could be associated with any certainty.

The overall objective is to first determine the probability of 
device failure due to defects. The second objective would be directed at 
minimizing the failure effects due to mechanisms. Because of the basic 
deterministic nature of Stewart's model, this could be accomplished through 
control of basic design of the device concerned. The formulations presented 
in references 17, 25 and 31-36 provide the means for assessing this design
modification. A feedback loop is thus established between mechanism rates 
of change and the electrical characteristics of interest. 

Suppose that the parameter variation which can be tolerated is known.
Then the integrated effect of a mechanism over a specified period of
time, and its associated electrical characteristic change, can be
compared with the known tolerance. This method therefore provides the
basic means by which it can be determined whether device design and
materials are compatible with the proposed application.
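
The comparison just described can be sketched as follows in Python. The
sketch integrates a mechanism rate over the application time, converts
it to an electrical parameter change through a constant sensitivity, and
compares the result with the tolerance. A single mechanism, a constant
sensitivity, trapezoidal integration and all numerical values are
assumptions made for illustration.

```python
def integrated_parameter_change(sensitivity, mechanism_rate,
                                mission_hours, steps=1000):
    """Trapezoidal integral of sensitivity * mechanism_rate(t) dt.

    sensitivity    : parameter change per unit of mechanism progress
    mechanism_rate : function of time (hours) giving progress per hour
    """
    dt = mission_hours / steps
    total = 0.0
    for k in range(steps):
        t0 = k * dt
        t1 = t0 + dt
        total += 0.5 * sensitivity * (mechanism_rate(t0)
                                      + mechanism_rate(t1)) * dt
    return total


def design_is_compatible(parameter_change, tolerance):
    """True when the accumulated change stays within tolerance."""
    return abs(parameter_change) <= tolerance


# Hypothetical mechanism progressing at a constant 2e-6 per hour,
# sensitivity of 5 parameter units per unit progress, 10,000 hours.
change = integrated_parameter_change(5.0, lambda t: 2.0e-6, 1.0e4)
acceptable = design_is_compatible(change, tolerance=0.2)
```

If the check fails, the design or materials would be revised and the
calculation repeated, which is the feedback loop between mechanism rates
and electrical characteristics described above.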

This does not in itself correct the situation by eliminating a
mechanism and its associated failure. However, the presence of a
specific mechanism is acknowledged and accommodated in such a way as
to minimize its influence on device failure.


To reiterate, the basic intent of this rationale as it applies to 
long term applications is twofold. First, the impact of failure-causing
defects upon device reliability is accounted for in a way which relates in 
standard fashion to the probability of success/failure. Second, the known 
failure-causing mechanisms are considered in a way which reduces the chance 
of failure occurrence related to a specific mechanism within the time allotted 
for the application. Again, this second objective is accomplished through 
basic design and material (if available) changes. 

The problem areas are the same as those associated with Stewart's
rationale. They are three in number and are directly related to
formulation (4). First, the most frequent or typical failure-causing
mechanisms must be isolated and their associated activation energies
understood. Second, the rate of mechanism change with respect to time,
once the mechanism is activated, must be determined. Third, the device
parameter change with respect to a specific mechanism must be
determined.

Theoretically, each of the three preceding proposals could be of
value as a prediction rationale for extended applications. Some of
these are more complicated and esthetically pleasing than the others.
However, given the present state of knowledge, it seems logical that a
simple formulation which also provides a basis (with additional work)
for a more comprehensive and accurate model is the best starting
point.

It must be agreed that the three-way interaction between
acceleration, mechanism isolation, and mechanism rate of change is
extremely complicated. It is not at all clear when sufficient data
will be available for correlation and definition of the interactions
of these three factors. Therefore, it seems that the ALT approach to a
failure prediction rationale is not compatible with presently
available data. It also seems many steps removed from the initial
phases necessary for understanding the prediction process.

The remaining two alternatives presented here have one important 
similarity. This is in regard to mechanism rate of change and its corresponding 
relation to electrical characteristics and activation energies. 

The major complication arises when one tries to associate failure
mechanisms with statistical distributions. Indeed, there is some
question as to whether this is even necessary. A comprehensive
understanding of current and future devices can be attained along with
the development of the design feedback idea.
2.5 Defect Model Implementation

2.5.1 Probability Determination

The basic defect model formulation given in Section 2.2 (equations
1, 2, and 3) is a factor operation upon the parameters $\lambda_M$,
$\lambda_D$ and $\lambda_Q$. Because of time constraints,
determination of values for these lambda parameters will be the
concern of future work. Ultimately the objective of this work will be
to provide realistic and consistent values for the lambda parameters
associated with particular device types. The subject of this section
is the work which has been done toward the determination of the
probabilities stated and defined in equations (1), (2), and (3). In
particular, a computational rationale for obtaining the value of $P_E$
will be presented ($P_E$ is defined below).

It was determined in Sections 1.0 and 2.0 that quality defects were
the major concern relating to early device failure. Therefore, it is
important to determine an estimate for the probability associated with
the quality parameter $\lambda_Q$. The probabilities referred to here
are denoted by $P_E$ and $P_U$, where $P_E$ represents the probability
of a defective device escaping supplier reliability inspection and
screening, and $P_U$ is the probability of a defective device escaping
user reliability inspection and screening.

A computational procedure will be defined and an example for $P_E$
will be given. Examples for $P_U$ will not be given here since this
probability calculation can be done in a manner similar to that given
for $P_E$.

Consider an arbitrary supplier type screening procedure. Suppose,
for convenience, that this particular procedure is set up for the
detection of some specific device defect ($d_i$), where i represents a
specific type of defect. The devices entering the screen can be
thought of as being good or defective. The screening procedure will
indicate that the device is either good or bad. In the former case the
device will escape, while in the latter it will be rejected.

Probabilities may be associated with the quality of the entering 
devices. For the good devices, the probability will be denoted as P(g). 

The defective probability for entering devices will be defined as P(d). 

Since the screening procedure was not assumed to be perfect, there are 
probabilities associated with an incorrect screening determination. These 
probabilities are: 

(1) The device is rejected when it is actually good, i.e., given
    that the screened device is good (g), the screen determines
    that it is not accepted. The probability associated with this
    situation will be denoted as: P(E^c/g).

(2) The device is accepted (or escapes) when it is defective,
    i.e., given that the screened device is defective (d), the
    screen determines that it is accepted (escapes). This
    probability will be denoted as: P(E/d).

The other probabilities are P(E/g), the probability of a good device
escaping screening, and P(E^c/d), the probability of a defective device
not escaping. The screening procedure and its respective probabilities
are represented in Figure 1.

P(E/d), P(E/g), P(E^c/g) and P(E^c/d) are, in total, a description of the
quality of a specific screen. Estimates for these probability values could
be obtained from screening yields. A simple example of these calculations
is given in Section 2.5.4.

From the information presented in Figure 1, the probability of
having a defective device in the lot that escaped screening may be
determined. This probability will be denoted as P(d/E). It can be
calculated via the following formulas:

(8)  $$P(d/E) = \frac{P(d)\,P(E/d)}{P(d)\,P(E/d) + P(g)\,P(E/g)}$$

(9)  $$P(g/E) = 1 - P(d/E)$$
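
As a minimal sketch of formulas (8) and (9), the Python function below
computes P(d/E) from a prior P(d) and the two escape probabilities of a
single screen. The function name and the numerical inputs are hypothetical
and chosen only for illustration; they are not taken from the report's data.

```python
def posterior_defective(p_d: float, p_e_given_d: float, p_e_given_g: float) -> float:
    """Formula (8): probability that a device which escaped is defective."""
    p_g = 1.0 - p_d  # entering devices are either good or defective
    numerator = p_d * p_e_given_d
    denominator = numerator + p_g * p_e_given_g
    if denominator == 0.0:
        raise ValueError("no device can escape this screen; P(d/E) is undefined")
    return numerator / denominator


# Hypothetical inputs: 10% defective on entry; the screen passes 20% of
# defective devices and 95% of good devices.
p_d_given_e = posterior_defective(0.10, 0.20, 0.95)
p_g_given_e = 1.0 - p_d_given_e  # formula (9)
print(f"P(d/E) = {p_d_given_e:.4f}, P(g/E) = {p_g_given_e:.4f}")
```

Because this hypothetical screen passes good devices far more readily than
defective ones, the defective fraction among escaping devices comes out
smaller than the entering fraction P(d).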

Suppose that a screening procedure is considered which has a number
of successive stages like that of Figure 1. Formula (8) can then be applied
at the end of each stage, thus yielding successive values for P(d/E). This
becomes an iterative arrangement in which the P(g/E) and P(d/E) of one
stage respectively become the P(g) and P(d) of the next screening stage.
The value of P(d/E) after the last screening stage is then the overall
probability of a defective device having escaped every intervening stage.
Therefore, P(d/E) and the P_E of equations (1), (2) and (3) are
equivalent. This is illustrated in Table 4.
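
The iterative arrangement can be sketched as a short loop in which the
P(d/E) of one stage is fed in as the P(d) of the next. This is an
illustrative sketch only: the function name, the three identical stages,
and the probability values are assumptions, not data from the report.

```python
def chain_screens(prior_d, stages):
    """Apply formula (8) once per stage.

    prior_d -- P(d) for devices entering the first stage
    stages  -- sequence of (P(E/d), P(E/g)) pairs, one per stage
    Returns the list of P(d/E) values, one after each stage.
    """
    p_d = prior_d
    posteriors = []
    for p_e_given_d, p_e_given_g in stages:
        p_g = 1.0 - p_d
        p_d = (p_d * p_e_given_d) / (p_d * p_e_given_d + p_g * p_e_given_g)
        posteriors.append(p_d)  # becomes P(d) of the next stage
    return posteriors


# Hypothetical example: three identical stages, 20% defective on entry.
values = chain_screens(0.20, [(0.30, 0.90)] * 3)
p_e_estimate = values[-1]  # P(d/E) after the last stage, i.e. P_E
for stage, value in enumerate(values, start=1):
    print(f"after stage {stage}: P(d/E) = {value:.4f}")
```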

The probability values not determined by data or calculated from
formulas (8) and (9) are the initial P(g) and P(d) values denoted in
Figure 1. The probabilities P(g) and P(d) will be referred to as "prior"
probabilities. They are associated with some appropriate prior
distribution. This prior distribution may be initially unknown to the
extent that only crude estimates of P(g) and P(d) can be determined.
However, as more data become available, better estimates for P(g) and P(d)
can be made. The combination of prior estimates and subsequent iteration
forms a Bayesian technique for the estimation of P_E.

2.5.2 Assumptions and Notation

In order to make the calculations described in Section 2.5.1, it is
first necessary to consider the screening procedure model shown in
Figure 2. The subsequent discussion describes the assumptions made
regarding this procedure.

From the discussion in Section 1.1, it was determined that supplier/user
cooperation was necessary to achieve quality devices. This is
particularly true for MSI and LSI devices. The notation for a screening
procedure, P_{i,k} (k = 1, 2), reflects this type of supplier/user
relation. For this reason it was assumed in this section that screening
procedures of the same type (i.e., each i) were done in two stages: one
stage (k = 1) by the supplier, the other (k = 2) by the user. This may not
be practical in all cases, but in any event stage 2 is assumed to be
identical to stage 1. This is for the purpose of estimating the number of
defects escaping from stage 1.

A second assumption regarding the calculations presented here applies
to the efficiency of both stages of the i-th procedure. It was assumed for
the i-th procedure that stage 1 and stage 2 were equally efficient in terms
of detecting defects. This implies that the expected yield of good and
defective devices will be of the same proportion for each stage of the
i-th procedure.

2.5.3 Iteration Algorithm

This algorithm is the formulation associated with the procedure model
given in Figure 2. The steps are represented in terms of P(d/E), since
P(g/E) follows directly from formula (9). For the devices escaping (E)
the i-th procedure, k-th stage, P_{i,k}(d/E) is the probability of having
a defect (d) among those devices which escape (E). The P_{i,k}(E/d) and
P_{i,k}(E/g) values are all calculated from the estimates of the
good/defective yield (see Table 3, Section 2.5.4).

Procedure 1, stage 1:

$$P_{1,1}(d/E) = \frac{P_{1,1}(d)\,P_{1,1}(E/d)}{P_{1,1}(d)\,P_{1,1}(E/d) + P_{1,1}(g)\,P_{1,1}(E/g)}$$

where: P_{1,1}(d), P_{1,1}(g) = initial prior probabilities

Procedure 1, stage 2:

$$P_{1,2}(d/E) = \frac{P_{1,2}(d)\,P_{1,2}(E/d)}{P_{1,2}(d)\,P_{1,2}(E/d) + P_{1,2}(g)\,P_{1,2}(E/g)}$$

where: P_{1,2}(d) = P_{1,1}(d/E)
       P_{1,2}(g) = 1 - P_{1,1}(d/E)

Procedure 2, stage 1:

$$P_{2,1}(d/E) = \frac{P_{2,1}(d)\,P_{2,1}(E/d)}{P_{2,1}(d)\,P_{2,1}(E/d) + P_{2,1}(g)\,P_{2,1}(E/g)}$$

* where:

$$P_{2,1}(d) = P_{1,2}(d/E) + P_{1,1}(d)\left[1 - \frac{\sum_{j,k}\left(N_{j,k} + \beta_{j,k}\right)}{K + \alpha}\right]$$

$$P_{2,1}(g) = 1 - P_{1,2}(d/E) - P_{1,1}(d)\left[1 - \frac{\sum_{j,k}\left(N_{j,k} + \beta_{j,k}\right)}{K + \alpha}\right]$$

N_{i,k}    = The number of defects detected in procedure i, stage k.
\beta_{i,k} = The quantitative error for N_{i,k}.
K          = The total number of defects detected for all screening
             procedures.
\alpha     = The quantitative error for K.

Procedure 2, stage 2:

$$P_{2,2}(d/E) = \frac{P_{2,2}(d)\,P_{2,2}(E/d)}{P_{2,2}(d)\,P_{2,2}(E/d) + P_{2,2}(g)\,P_{2,2}(E/g)}$$

where: P_{2,2}(d) = P_{2,1}(d/E)
       P_{2,2}(g) = 1 - P_{2,1}(d/E)

General term:

$$P_{i,1}(d/E) = \frac{P_{i,1}(d)\,P_{i,1}(E/d)}{P_{i,1}(d)\,P_{i,1}(E/d) + P_{i,1}(g)\,P_{i,1}(E/g)}$$

* where:

$$P_{i,1}(d) = P_{i-1,2}(d/E) + P_{1,1}(d)\left[1 - \frac{\sum_{j,k}\left(N_{j,k} + \beta_{j,k}\right)}{K + \alpha}\right]$$

$$P_{i,1}(g) = 1 - P_{i-1,2}(d/E) - P_{1,1}(d)\left[1 - \frac{\sum_{j,k}\left(N_{j,k} + \beta_{j,k}\right)}{K + \alpha}\right]$$

$$P_{i,2}(d/E) = \frac{P_{i,2}(d)\,P_{i,2}(E/d)}{P_{i,2}(d)\,P_{i,2}(E/d) + P_{i,2}(g)\,P_{i,2}(E/g)}$$

where: P_{i,2}(d) = P_{i,1}(d/E)
       P_{i,2}(g) = 1 - P_{i,1}(d/E)

* NOTE:

The extra factor for P_{i,1}(d) and P_{i,1}(g) is only included after
the two stages of the i-th procedure are completed. This factor
represents the proportion of the initial defects (given by P_{1,1}(d))
which remain and require screening. It is included because each
pair of i-th procedures screens for the same specific type of defect.
Therefore, a portion of P_{1,1}(d) remains after P_{i,2} is completed.
The factor is derived from a simple linear degradation which
computes the portion of P_{1,1}(d) remaining after each two screens.
For this reason it may be noted from Table 4 that P_{i,1}(d/E) is
in every case (except for the last) larger than the preceding
P_{i-1,2}(d/E). The last case is the exception because α and β (see
below) were assumed to be negligible. Therefore, as far as the
computations are concerned, there are no more different types of
defects remaining after the last stage.

For α and β:

No data were available from which estimates for α and β could be made.
Therefore, the error terms were assumed to be negligible for the
example presented in Section 2.5.4. Estimates for α and β can be
obtained from analyses which relate the appropriate cause of device
failure to the screening procedure which failed to detect the
cause in question.
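
The prior carried into the first stage of a new procedure can be sketched
as follows. This is a hedged illustration: the function name and inputs are
hypothetical, and it assumes that the linear-degradation factor counts the
defects detected in the procedures already completed, which is one reading
of the note above. α and β default to zero, as in the report's example.

```python
def next_procedure_prior(prev_posterior, initial_prior_d,
                         detected_so_far, total_detected,
                         alpha=0.0, beta_so_far=0.0):
    """Priors (P(d), P(g)) for the first stage of the next procedure.

    prev_posterior  -- P(d/E) from the stage just completed
    initial_prior_d -- the initial prior P(d) of the first stage
    detected_so_far -- defects detected in completed stages (assumed reading)
    total_detected  -- K, total defects detected by all procedures
    alpha           -- error term for K
    beta_so_far     -- summed error terms for the detected counts
    """
    remaining = 1.0 - (detected_so_far + beta_so_far) / (total_detected + alpha)
    p_d = prev_posterior + initial_prior_d * remaining
    if not 0.0 <= p_d <= 1.0:
        raise ValueError("inputs give a prior outside [0, 1]")
    return p_d, 1.0 - p_d


# Hypothetical numbers: 20 of 50 defects found so far, initial prior 0.10,
# and P(d/E) = 0.02 after the previous procedure.
p_d, p_g = next_procedure_prior(0.02, 0.10, detected_so_far=20, total_detected=50)
print(f"P(d) = {p_d:.4f}, P(g) = {p_g:.4f}")
```

When every detected defect has been accounted for (detected_so_far equal to
K, with α and β zero), the factor vanishes and the prior reduces to the
previous P(d/E).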

2.5.4 Example Calculation

The data for the calculations presented here were obtained from JPL
MM 71 parts procurement information. These calculations were made to
illustrate the algorithm presented in Section 2.5.3. Only limited data
were considered, however. More data are available and will be included in
the iteration algorithm at a later date.

Consider the following screening example. The notation used will
be the same as that stated in Sections 2.5.2 and 2.5.3. Seventy devices
enter at P_{1,1}. Other quantitative estimates associated with Figure 3
are contained in Table 3.



Figure 3

The calculation of P_{i,k}(E/d) and P_{i,k}(E/g) used in the algorithm of
Section 2.5.3 is explained in the following discussion.


Consider the procedure P_i, consisting of stage P_{i,1} followed by stage
P_{i,2}, which detect N_{i,1} and N_{i,2} defects respectively.

Assumptions: 1) The number of defects detected in P_{i,2} (i.e., N_{i,2})
                is an estimate for the number of defects which escape
                P_{i,1} without detection.

             2) P_{i,1} and P_{i,2} have equivalent efficiency with
                respect to the detection of defective devices. This
                implies that the proportion of good/defective devices
                for E_{i,2} is the same as for E_{i,1}.

Example:  D_{i,1} + G_{i,1} = E_{i,1}

          where D_{i,1} = N_{i,2}

          D_{i,2} + G_{i,2} = E_{i,2}

          where D_{i,2} = (D_{i,1} / E_{i,1}) E_{i,2}

P_{i,k}(E/d) = The probability of a device escaping P_{i,k} given that it
               is defective. An estimate for P_{i,k}(E/d) is given by the
               ratio of defects out of P_{i,k} to the total number of
               devices entering P_{i,k}.

P_{i,k}(E/g) = The probability of a device escaping P_{i,k} given that it
               is good. An estimate for P_{i,k}(E/g) is given by the ratio
               of good devices out of P_{i,k} to the total number of
               devices entering P_{i,k}.

Example: for P_{i,1}

$$P_{i,1}(E/d) = \frac{D_{i,1}}{E_{i-1,2}}$$

$$P_{i,1}(E/g) = \frac{G_{i,1}}{E_{i-1,2}}$$

where G_{i,1} = E_{i,1} - D_{i,1}
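
The yield-based estimates can be sketched in Python as below, using the
procedure 1 entries of Table 3 (seventy devices entering, 55 escaping
stage 1, 42 escaping stage 2, and 13 defects detected in stage 2). The
function name is hypothetical, and the proportional rule used for the
stage 2 defect estimate is an assumption drawn from assumption 2.

```python
def stage_escape_estimates(entering, e1, e2, n2):
    """Estimate escape probabilities for both stages of one procedure.

    entering -- devices entering stage 1
    e1, e2   -- total devices escaping stage 1 and stage 2
    n2       -- defects detected in stage 2
    """
    d1 = n2            # assumption 1: stage 2 detections escaped stage 1
    g1 = e1 - d1       # good devices escaping stage 1
    d2 = d1 / e1 * e2  # assumption 2 (assumed form): same defect proportion
    g2 = e2 - d2
    return {
        (1, "E/d"): d1 / entering,
        (1, "E/g"): g1 / entering,
        (2, "E/d"): d2 / e1,  # devices entering stage 2 are those escaping stage 1
        (2, "E/g"): g2 / e1,
    }


estimates = stage_escape_estimates(entering=70, e1=55, e2=42, n2=13)
for (stage, kind), value in estimates.items():
    print(f"procedure 1, stage {stage}: P({kind}) = {value:.4f}")
```

For procedure 1 the proportional rule gives a stage 2 defect estimate of
about 9.9, against the tabulated value of 10.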


A simple computer program was written for the algorithm presented in
Section 2.5.3. The quantitative estimates presented in Table 3 were used to
make the computations. A range of four prior probabilities (P_{1,1}(d) and
P_{1,1}(g)) was selected, since no specific estimates for P_{1,1}(d) and
P_{1,1}(g) could be made with the limited data presented here. The results
of these computations are presented in Table 4.
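
The original program is not reproduced in the report. A minimal Python
sketch of such a computation is given below, under stated assumptions: the
Table 3 values are used as printed, α and β are taken as zero, the
carry-over factor counts defects detected in completed procedures, and the
four prior values are hypothetical stand-ins. The output is therefore
illustrative and is not a reproduction of Table 4.

```python
# Table 3 values, in stage order (1,1), (1,2), (2,1), ..., (4,2).
N = [15, 13, 5, 3, 3, 3, 10, 3]        # defects detected
D = [13, 10, 1, 1, 3, 3, 3, 3]         # defects escape estimate
G = [42, 32, 36, 35, 30, 27, 17, 14]   # good devices escape estimate
E = [55, 42, 37, 36, 33, 30, 20, 17]   # total escape
LOT_SIZE = 70                          # devices entering the first stage


def run_screens(prior_d, alpha=0.0, beta=0.0):
    """Return P(d/E) after every stage, given the initial prior P(d)."""
    total = sum(N) + alpha             # K + alpha
    p_d = prior_d
    entering = LOT_SIZE
    detected = 0.0
    posteriors = []
    for s in range(len(N)):
        if s > 0 and s % 2 == 0:
            # First stage of a new procedure: add the share of the initial
            # defect probability assumed to remain (linear degradation).
            p_d += prior_d * (1.0 - detected / total)
        p_e_d = D[s] / entering        # estimate of P(E/d)
        p_e_g = G[s] / entering        # estimate of P(E/g)
        p_d = p_d * p_e_d / (p_d * p_e_d + (1.0 - p_d) * p_e_g)
        posteriors.append(p_d)
        detected += N[s] + beta
        entering = E[s]                # escaping devices feed the next stage
    return posteriors


for prior in (0.05, 0.10, 0.20, 0.30):  # hypothetical priors
    p_e = run_screens(prior)[-1]
    print(f"P(d) = {prior:.2f} -> P_E = {p_e:.4f}")
```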

The screening for this example ends after four procedures; therefore
the rationale provided in Section 2.5.1 applies (i.e., P_{4,2}(d/E) = P_E).
The probability P_E may now be substituted into equations (1), (2) and (3).
Completion of the defect model is then contingent upon the determination of
suitable lambda values for the device in the example.

3.0 CONCLUDING REMARKS


The first sections of this report, prior to the prediction rationale,
were included as a means of documenting information relevant to the
understanding of failure modeling. One of the most important factors
stated in these sections, concerning modeling techniques, is that of model
application. It is now becoming increasingly important that integrated
device failure models be amenable to long-term applications. In general
this can be defined as being between eight and twelve years.

The information researched indicates that existing modeling schemes,
either factor or mechanism, are by themselves inadequate for the longer
device applications currently proposed. The following explanation will
help to clarify this statement. Device failure can be caused by a
combination of defects and mechanisms. An example of this would be an
instance where a failure due to electromigration was hastened by the
presence of a metallization scratch (i.e., a defect). However, when longer
device applications are considered, it is conjectured that failures will
be more likely to be caused by either defects or mechanisms acting
independently. This means that the initial device failures will be
primarily due to defects while the later failures can mostly be associated
with mechanisms. This separation implies that failure due to
defect-mechanism interaction is small compared to their independent
influences. These reasons form the basis for selecting a modeling scheme
which is comprised of two components, one for defects, the other for
mechanisms.

The hours associated with the above demarcation are not precisely
known. Information obtained from the literature seems to indicate that the
estimate is between 30,000 and 60,000 hours. If the above conjecture is
correct, this time estimate must be further refined if a modeling scheme
is to be applied correctly. For example, if the standard exponential
assumption relating to defects (i.e., constant failure rate) is applied,
the time period during which defect failures occur must be estimated. To do
otherwise would apply the exponential assumption not only to the defect
portion of time but also to a time period which may be dominated by
mechanism failures. This would mean that mechanism failures are being
modeled via a constant failure rate assumption, which is not applicable to
failures caused by mechanisms.
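
The point about limiting the exponential assumption to the defect-dominated
period can be illustrated with a small sketch. The function name, the
failure rate, and the 30,000-hour cutoff (the lower end of the range quoted
above) are illustrative assumptions, not recommended values.

```python
import math


def defect_reliability(t_hours, failure_rate, demarcation_hours):
    """Reliability under a constant failure rate, R(t) = exp(-lambda * t).

    Refuses to extrapolate past the demarcation time, where mechanism
    failures are conjectured to dominate and the constant-rate assumption
    no longer applies.
    """
    if t_hours < 0:
        raise ValueError("time must be non-negative")
    if t_hours > demarcation_hours:
        raise ValueError("beyond the defect-dominated period; "
                         "a mechanism model is needed here")
    return math.exp(-failure_rate * t_hours)


# Hypothetical defect failure rate of 1e-6 per hour.
r_at_cutoff = defect_reliability(30_000, 1e-6, demarcation_hours=30_000)
print(f"R(30,000 h) = {r_at_cutoff:.4f}")
```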

The failure mechanism approach covered in Section 2.3 and described by
Vaccaro (reference 40) and others appears to be a useful way to analyze the
mechanism degradation process. However, there are problems associated with
this approach. As reported by Vaccaro in reference 40, there is a need for
better detection, identification and characterization of causes of
failure. These are both macroscopic and microscopic, either process induced
or intrinsic. There must also be a correlation with device reliability. In
addition, there is a need for more closely integrated reliability team
efforts which combine the insight into materials of the physicist, chemist
and metallurgist with the knowledge of the reliability engineer and parts
specialist.

It is felt that at the present stage mechanism models are most compatible
with reliability assessment in the area of design and materials influence.
This is because there is not enough theory and empirical information
available to specify the probability of device failure during a certain
time due to a specific mechanism. The formulations given by Stewart and
Vaccaro and presented in Section 3.0 do not provide an adequate base for
this probabilistic method. However, they are examples of the type of work
which can be done and how it applies to device reliability assessment.

There is a particularly important feature implicit in the procedure
presented in Section 2.0. This feature is adaptability. In this instance,
adaptability refers to the ability to draw as much reliability information
as is possible from the inherent qualities of the device types being
modeled. The same would be true for the processes and screens associated
with the device. Changes in the technology or processes relating to the
fabrication of the device thus directly impact the model. In this way an
automatic model update is provided. The portion of the model implemented in
Section 2.5 is adaptable since it relies heavily upon factors unique to the
devices under consideration. These factors are derived from source and user
screens which check for the quality associated with the devices.

Once the mechanism formulations are refined to a state which is more
practical for use, they too will be adaptable to changing device
technology. This is true since by definition (Section 1.5) we are talking
of models which explain the atomic and molecular level of device behavior.

The continuing efforts scheduled for Fiscal Year 72 will be directed
at refining and verifying (where possible) the rationale proposed in
Section 4.0. The specific areas of concern are listed below.

1. Estimation of defect model lambda (λ) parameters for limited
   types of devices

2. Investigation of defect and mechanism caused failures as a
   function of time

3. More detailed use of available screening and inspection data

4. Investigation of environmental factors relating to the proposed
   model

5. Incorporation of new mechanism formulations as they become available

6. Verification of modeling scheme using simple examples

TABLE 3

Procedure/Stage (P_{i,k})                P_{1,1} P_{1,2} P_{2,1} P_{2,2} P_{3,1} P_{3,2} P_{4,1} P_{4,2}

Defects Detected (N_{i,k})                  15      13       5       3       3       3      10       3
Defects Escape Estimate (D_{i,k})           13      10       1       1       3       3       3       3
Good Devices Escape Estimate (G_{i,k})      42      32      36      35      30      27      17      14
Total Escape (E_{i,k})                      55      42      37      36      33      30      20      17

*These entries are single values; more data are available and will be
analyzed at a later date.